Dataset statistics
| Number of variables | 17 |
|---|---|
| Number of observations | 5228430 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 438.8 MiB |
| Average record size in memory | 88.0 B |
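The average record size appears to be the total in-memory size divided by the number of observations. The sketch below checks that reading against the figures in the table; it assumes 1 MiB = 1024² bytes, and the function name is illustrative rather than part of the profiling tool.

```python
# Figures copied from the "Dataset statistics" table.
TOTAL_MEMORY_MIB = 438.8
N_OBSERVATIONS = 5_228_430
BYTES_PER_MIB = 1024 ** 2  # assumption: MiB means 2**20 bytes


def average_record_size(total_mib: float, n_rows: int) -> float:
    """Return the mean in-memory size of one row, in bytes."""
    return total_mib * BYTES_PER_MIB / n_rows


avg_bytes = average_record_size(TOTAL_MEMORY_MIB, N_OBSERVATIONS)
print(f"{avg_bytes:.1f} B")
```

Under these assumptions the result rounds to the 88.0 B reported above.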
Variable types
| Categorical | 6 |
|---|---|
| DateTime | 1 |
| Numeric | 10 |
Alerts
| jockey has a high cardinality: 178 distinct values | High cardinality |
|---|---|
| race_number is highly correlated with post_time | High correlation |
| latitude is highly correlated with longitude | High correlation |
| longitude is highly correlated with latitude | High correlation |
| post_time is highly correlated with race_number | High correlation |
| distance_id is highly correlated with weight_carried | High correlation |
| weight_carried is highly correlated with distance_id | High correlation |
| track_id is highly correlated with latitude and 1 other fields | High correlation |
| trakus_index is highly correlated with distance_id and 1 other fields | High correlation |
| latitude is highly correlated with track_id and 1 other fields | High correlation |
| longitude is highly correlated with track_id and 1 other fields | High correlation |
| distance_id is highly correlated with trakus_index and 3 other fields | High correlation |
| course_type is highly correlated with trakus_index and 4 other fields | High correlation |
| track_condition is highly correlated with course_type | High correlation |
| run_up_distance is highly correlated with distance_id and 1 other fields | High correlation |
| weight_carried is highly correlated with distance_id and 1 other fields | High correlation |
| run_up_distance has 59344 (1.1%) zeros | Zeros |
Reproduction
| Analysis started | 2022-08-18 10:10:13.040521 |
|---|---|
| Analysis finished | 2022-08-18 10:14:37.992291 |
| Duration | 4 minutes and 24.95 seconds |
| Software version | pandas-profiling v3.2.0 |
| Download configuration | config.json |
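The reported duration can be re-derived from the two timestamps in the table. This is a minimal sketch using only the standard library; the format string is an assumption about how the timestamps were written, not something the report states.

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" table.
FMT = "%Y-%m-%d %H:%M:%S.%f"  # assumed layout of the printed timestamps
started = datetime.strptime("2022-08-18 10:10:13.040521", FMT)
finished = datetime.strptime("2022-08-18 10:14:37.992291", FMT)

# Split the elapsed time into whole minutes and remaining seconds.
elapsed = finished - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)
print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```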
track_id
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 39.9 MiB |
| AQU | 2158369 |
|---|---|
| BEL | 1947134 |
| SAR | 1122927 |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 15685290 |
|---|---|
| Distinct characters | 8 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | AQU |
|---|---|
| 2nd row | AQU |
| 3rd row | AQU |
| 4th row | AQU |
| 5th row | AQU |
Common Values
| Value | Count | Frequency (%) |
| AQU | 2158369 | 41.3% |
| BEL | 1947134 | 37.2% |
| SAR | 1122927 | 21.5% |
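The Frequency (%) column is each count divided by the number of observations. A short sketch of that arithmetic for the three track_id values follows; rounding to one decimal place is an assumption made to match how percentages are printed elsewhere in this report.

```python
# Counts copied from the track_id "Common Values" table.
counts = {"AQU": 2_158_369, "BEL": 1_947_134, "SAR": 1_122_927}

# The counts should add up to the number of observations (5228430).
total = sum(counts.values())

# Share of each value as a percentage, rounded to one decimal place.
freq = {value: round(100 * n / total, 1) for value, n in counts.items()}

for value, pct in freq.items():
    print(f"{value}: {pct}%")
```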
Words
| Value | Count | Frequency (%) |
| aqu | 2158369 | 41.3% |
| bel | 1947134 | 37.2% |
| sar | 1122927 | 21.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| A | 3281296 | 20.9% |
| Q | 2158369 | 13.8% |
| U | 2158369 | 13.8% |
| B | 1947134 | 12.4% |
| E | 1947134 | 12.4% |
| L | 1947134 | 12.4% |
| S | 1122927 | 7.2% |
| R | 1122927 | 7.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 15685290 | 100.0% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 3281296 | 20.9% |
| Q | 2158369 | 13.8% |
| U | 2158369 | 13.8% |
| B | 1947134 | 12.4% |
| E | 1947134 | 12.4% |
| L | 1947134 | 12.4% |
| S | 1122927 | 7.2% |
| R | 1122927 | 7.2% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 15685290 | 100.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| A | 3281296 | 20.9% |
| Q | 2158369 | 13.8% |
| U | 2158369 | 13.8% |
| B | 1947134 | 12.4% |
| E | 1947134 | 12.4% |
| L | 1947134 | 12.4% |
| S | 1122927 | 7.2% |
| R | 1122927 | 7.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 15685290 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| A | 3281296 | 20.9% |
| Q | 2158369 | 13.8% |
| U | 2158369 | 13.8% |
| B | 1947134 | 12.4% |
| E | 1947134 | 12.4% |
| L | 1947134 | 12.4% |
| S | 1122927 | 7.2% |
| R | 1122927 | 7.2% |
race_date
Date
| Distinct | 217 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 39.9 MiB |
| Minimum | 2019-01-01 00:00:00 |
|---|---|
| Maximum | 2019-12-31 00:00:00 |
race_number
Real number (ℝ≥0)
| Distinct | 13 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.505408889 |
| Minimum | 1 |
|---|---|
| Maximum | 13 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.0 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3 |
| median | 6 |
| Q3 | 8 |
| 95-th percentile | 10 |
| Maximum | 13 |
| Range | 12 |
| Interquartile range (IQR) | 5 |
Descriptive statistics
| Standard deviation | 2.860655522 |
|---|---|
| Coefficient of variation (CV) | 0.5196081852 |
| Kurtosis | -0.980067865 |
| Mean | 5.505408889 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 0.07692640069 |
| Sum | 28784645 |
| Variance | 8.183350015 |
| Monotonicity | Not monotonic |
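Several of the descriptive statistics above are simple functions of one another, which makes the table easy to sanity-check. The sketch below recomputes the variance, coefficient of variation, IQR, range and sum from the other reported figures; the variable names are illustrative, and the inputs are the rounded values printed in the report, so agreement is only expected to within rounding.

```python
# Figures copied from the race_number statistics in this report.
n_observations = 5_228_430
mean = 5.505408889
std = 2.860655522
q1, q3 = 3, 8
minimum, maximum = 1, 13

variance = std ** 2               # variance is the squared standard deviation
cv = std / mean                   # coefficient of variation
iqr = q3 - q1                     # interquartile range
value_range = maximum - minimum   # range
total = mean * n_observations     # sum of all values

print(f"variance={variance:.6f} cv={cv:.6f} iqr={iqr} "
      f"range={value_range} sum={total:.0f}")
```

The recomputed values match the reported variance (8.183350015), CV (0.5196081852), IQR (5), range (12) and sum (28784645) to the precision shown.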
| Value | Count | Frequency (%) |
| 8 | 582413 | 11.1% |
| 6 | 575217 | 11.0% |
| 5 | 559884 | 10.7% |
| 7 | 554623 | 10.6% |
| 4 | 523846 | 10.0% |
| 2 | 515115 | 9.9% |
| 1 | 508808 | 9.7% |
| 9 | 492915 | 9.4% |
| 3 | 490256 | 9.4% |
| 10 | 283141 | 5.4% |
| Other values (3) | 142212 | 2.7% |
| Value | Count | Frequency (%) |
| 1 | 508808 | 9.7% |
| 2 | 515115 | 9.9% |
| 3 | 490256 | 9.4% |
| 4 | 523846 | 10.0% |
| 5 | 559884 | 10.7% |
| 6 | 575217 | 11.0% |
| 7 | 554623 | 10.6% |
| 8 | 582413 | 11.1% |
| 9 | 492915 | 9.4% |
| 10 | 283141 | 5.4% |
| Value | Count | Frequency (%) |
| 13 | 11683 | 0.2% |
| 12 | 31725 | 0.6% |
| 11 | 98804 | 1.9% |
| 10 | 283141 | 5.4% |
| 9 | 492915 | 9.4% |
| 8 | 582413 | 11.1% |
| 7 | 554623 | 10.6% |
| 6 | 575217 | 11.0% |
| 5 | 559884 | 10.7% |
| 4 | 523846 | 10.0% |
program_number
Categorical
| Distinct | 20 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 39.9 MiB |
| 4 | 611396 |
|---|---|
| 3 | 611025 |
| 5 | 606666 |
| 2 | 602597 |
| 1 | 599979 |
| Other values (15) | 2196767 |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 15685290 |
|---|---|
| Distinct characters | 14 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 6 |
|---|---|
| 2nd row | 6 |
| 3rd row | 6 |
| 4th row | 6 |
| 5th row | 6 |
Common Values
| Value | Count | Frequency (%) |
| 4 | 611396 | 11.7% |
| 3 | 611025 | 11.7% |
| 5 | 606666 | 11.6% |
| 2 | 602597 | 11.5% |
| 1 | 599979 | 11.5% |
| 6 | 579893 | 11.1% |
| 7 | 486905 | 9.3% |
| 8 | 373202 | 7.1% |
| 9 | 266526 | 5.1% |
| 10 | 186835 | 3.6% |
| Other values (10) | 303406 | 5.8% |
Words
| Value | Count | Frequency (%) |
| 4 | 611396 | 11.7% |
| 3 | 611025 | 11.7% |
| 5 | 606666 | 11.6% |
| 2 | 602597 | 11.5% |
| 1 | 599979 | 11.5% |
| 6 | 579893 | 11.1% |
| 7 | 486905 | 9.3% |
| 8 | 373202 | 7.1% |
| 9 | 266526 | 5.1% |
| 10 | 186835 | 3.6% |
| Other values (10) | 303406 | 5.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 9966619 | 63.5% |
| 1 | 1203337 | 7.7% |
| 2 | 678041 | 4.3% |
| 3 | 637866 | 4.1% |
| 4 | 626477 | 4.0% |
| 5 | 614583 | 3.9% |
| 6 | 583484 | 3.7% |
| 7 | 486905 | 3.1% |
| 8 | 373202 | 2.4% |
| 9 | 266526 | 1.7% |
| Other values (4) | 248250 | 1.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Space Separator | 9966619 | 63.5% |
| Decimal Number | 5657256 | 36.1% |
| Uppercase Letter | 61415 | 0.4% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 1203337 | 21.3% |
| 2 | 678041 | 12.0% |
| 3 | 637866 | 11.3% |
| 4 | 626477 | 11.1% |
| 5 | 614583 | 10.9% |
| 6 | 583484 | 10.3% |
| 7 | 486905 | 8.6% |
| 8 | 373202 | 6.6% |
| 9 | 266526 | 4.7% |
| 0 | 186835 | 3.3% |
Uppercase Letter
| Value | Count | Frequency (%) |
| A | 56753 | 92.4% |
| B | 4081 | 6.6% |
| X | 581 | 0.9% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 9966619 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 15623875 | 99.6% |
| Latin | 61415 | 0.4% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| (space) | 9966619 | 63.8% |
| 1 | 1203337 | 7.7% |
| 2 | 678041 | 4.3% |
| 3 | 637866 | 4.1% |
| 4 | 626477 | 4.0% |
| 5 | 614583 | 3.9% |
| 6 | 583484 | 3.7% |
| 7 | 486905 | 3.1% |
| 8 | 373202 | 2.4% |
| 9 | 266526 | 1.7% |
Latin
| Value | Count | Frequency (%) |
| A | 56753 | 92.4% |
| B | 4081 | 6.6% |
| X | 581 | 0.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 15685290 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 9966619 | 63.5% |
| 1 | 1203337 | 7.7% |
| 2 | 678041 | 4.3% |
| 3 | 637866 | 4.1% |
| 4 | 626477 | 4.0% |
| 5 | 614583 | 3.9% |
| 6 | 583484 | 3.7% |
| 7 | 486905 | 3.1% |
| 8 | 373202 | 2.4% |
| 9 | 266526 | 1.7% |
| Other values (4) | 248250 | 1.6% |
trakus_index
Real number (ℝ≥0)
| Distinct | 1062 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 183.8652125 |
| Minimum | 1 |
|---|---|
| Maximum | 1062 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 10.0 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 18 |
| Q1 | 88 |
| median | 176 |
| Q3 | 264 |
| 95-th percentile | 377 |
| Maximum | 1062 |
| Range | 1061 |
| Interquartile range (IQR) | 176 |
Descriptive statistics
| Standard deviation | 118.3326899 |
|---|---|
| Coefficient of variation (CV) | 0.6435838964 |
| Kurtosis | 2.233086316 |
| Mean | 183.8652125 |
| Median Absolute Deviation (MAD) | 88 |
| Skewness | 0.8361103605 |
| Sum | 961326393 |
| Variance | 14002.62549 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 14915 | 0.3% |
| 2 | 14915 | 0.3% |
| 3 | 14915 | 0.3% |
| 4 | 14915 | 0.3% |
| 8 | 14914 | 0.3% |
| 15 | 14914 | 0.3% |
| 16 | 14914 | 0.3% |
| 9 | 14914 | 0.3% |
| 10 | 14914 | 0.3% |
| 11 | 14914 | 0.3% |
| Other values (1052) | 5079286 |
| Value | Count | Frequency (%) |
| 1 | 14915 | |
| 2 | 14915 | |
| 3 | 14915 | |
| 4 | 14915 | |
| 5 | 14914 | |
| 6 | 14914 | |
| 7 | 14914 | |
| 8 | 14914 | |
| 9 | 14914 | |
| 10 | 14914 |
| Value | Count | Frequency (%) |
| 1062 | 9 | |
| 1061 | 9 | |
| 1060 | 9 | |
| 1059 | 9 | |
| 1058 | 9 | |
| 1057 | 9 | |
| 1056 | 9 | |
| 1055 | 9 | |
| 1054 | 9 | |
| 1053 | 9 |
latitude
Real number (ℝ≥0)
| Distinct | 5224784 |
|---|---|
| Distinct (%) | 99.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 41.20386704 |
| Minimum | 40.6667108 |
|---|---|
| Maximum | 43.07399173 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 39.9 MiB |
Quantile statistics
| Minimum | 40.6667108 |
|---|---|
| 5-th percentile | 40.67021298 |
| Q1 | 40.67345603 |
| median | 40.71393361 |
| Q3 | 40.71739324 |
| 95-th percentile | 43.07299548 |
| Maximum | 43.07399173 |
| Range | 2.407280932 |
| Interquartile range (IQR) | 0.04393720806 |
Descriptive statistics
| Standard deviation | 0.9771226713 |
|---|---|
| Coefficient of variation (CV) | 0.02371434386 |
| Kurtosis | -0.07199889901 |
| Mean | 41.20386704 |
| Median Absolute Deviation (MAD) | 0.03990861543 |
| Skewness | 1.387718403 |
| Sum | 215431534.5 |
| Variance | 0.9547687148 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 43.07179234 | 98 | < 0.1% |
| 40.66672127 | 46 | < 0.1% |
| 43.06865706 | 31 | < 0.1% |
| 43.06869702 | 31 | < 0.1% |
| 43.068705 | 31 | < 0.1% |
| 43.06867304 | 31 | < 0.1% |
| 43.06868104 | 31 | < 0.1% |
| 43.06871299 | 31 | < 0.1% |
| 43.06866505 | 31 | < 0.1% |
| 43.06864907 | 31 | < 0.1% |
| Other values (5224774) | 5228038 |
| Value | Count | Frequency (%) |
| 40.6667108 | 2 | < 0.1% |
| 40.66671874 | 3 | < 0.1% |
| 40.66672066 | 4 | < 0.1% |
| 40.66672127 | 46 | |
| 40.66672224 | 3 | < 0.1% |
| 40.66672262 | 3 | < 0.1% |
| 40.66672633 | 3 | < 0.1% |
| 40.66672791 | 3 | < 0.1% |
| 40.66673098 | 4 | < 0.1% |
| 40.66673204 | 9 | < 0.1% |
| Value | Count | Frequency (%) |
| 43.07399173 | 1 | |
| 43.0739917 | 1 | |
| 43.07399097 | 1 | |
| 43.07398948 | 1 | |
| 43.07398943 | 1 | |
| 43.07398563 | 1 | |
| 43.07398517 | 1 | |
| 43.07398091 | 1 | |
| 43.07398008 | 1 | |
| 43.07397977 | 1 |
longitude
Real number (ℝ)
| Distinct | 5224799 |
|---|---|
| Distinct (%) | 99.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -73.77702389 |
| Minimum | -73.83260149 |
|---|---|
| Maximum | -73.7148265 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 5228430 |
| Negative (%) | 100.0% |
| Memory size | 39.9 MiB |
Quantile statistics
| Minimum | -73.83260149 |
|---|---|
| 5-th percentile | -73.83173468 |
| Q1 | -73.82883411 |
| median | -73.76914831 |
| Q3 | -73.72563624 |
| 95-th percentile | -73.72015512 |
| Maximum | -73.7148265 |
| Range | 0.1177749864 |
| Interquartile range (IQR) | 0.1031978737 |
Descriptive statistics
| Standard deviation | 0.04711054248 |
|---|---|
| Coefficient of variation (CV) | -0.0006385530344 |
| Kurtosis | -1.735802319 |
| Mean | -73.77702389 |
| Median Absolute Deviation (MAD) | 0.04751605643 |
| Skewness | -0.03086390547 |
| Sum | -385738005 |
| Variance | 0.002219403212 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| -73.76465767 | 98 | < 0.1% |
| -73.83021525 | 46 | < 0.1% |
| -73.77195764 | 31 | < 0.1% |
| -73.77196897 | 31 | < 0.1% |
| -73.77194633 | 31 | < 0.1% |
| -73.77192934 | 31 | < 0.1% |
| -73.77194066 | 31 | < 0.1% |
| -73.77191803 | 31 | < 0.1% |
| -73.77196331 | 31 | < 0.1% |
| -73.77192369 | 31 | < 0.1% |
| Other values (5224789) | 5228038 |
| Value | Count | Frequency (%) |
| -73.83260149 | 1 | |
| -73.83260146 | 1 | |
| -73.83260062 | 1 | |
| -73.8326005 | 1 | |
| -73.83259902 | 1 | |
| -73.83259817 | 1 | |
| -73.83259612 | 1 | |
| -73.83259494 | 1 | |
| -73.83259237 | 1 | |
| -73.83259065 | 1 |
| Value | Count | Frequency (%) |
| -73.7148265 | 1 | < 0.1% |
| -73.71489734 | 8 | |
| -73.7149337 | 1 | < 0.1% |
| -73.71493525 | 1 | < 0.1% |
| -73.71493911 | 1 | < 0.1% |
| -73.71495822 | 1 | < 0.1% |
| -73.71496318 | 7 | |
| -73.71496709 | 1 | < 0.1% |
| -73.71497419 | 1 | < 0.1% |
| -73.71497811 | 1 | < 0.1% |
distance_id
Real number (ℝ≥0)
| Distinct | 20 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 756.3151271 |
| Minimum | 450 |
|---|---|
| Maximum | 2000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 10.0 MiB |
Quantile statistics
| Minimum | 450 |
|---|---|
| 5-th percentile | 550 |
| Q1 | 600 |
| median | 800 |
| Q3 | 850 |
| 95-th percentile | 1000 |
| Maximum | 2000 |
| Range | 1550 |
| Interquartile range (IQR) | 250 |
Descriptive statistics
| Standard deviation | 179.6886889 |
|---|---|
| Coefficient of variation (CV) | 0.2375844175 |
| Kurtosis | 13.04437961 |
| Mean | 756.3151271 |
| Median Absolute Deviation (MAD) | 100 |
| Skewness | 2.659565266 |
| Sum | 3954340700 |
| Variance | 32288.02492 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 600 | 1194752 | |
| 800 | 1137856 | |
| 850 | 776832 | |
| 700 | 595436 | |
| 650 | 466190 | 8.9% |
| 900 | 416372 | 8.0% |
| 550 | 280966 | 5.4% |
| 1000 | 87941 | 1.7% |
| 1100 | 83223 | 1.6% |
| 1200 | 45213 | 0.9% |
| Other values (10) | 143649 | 2.7% |
| Value | Count | Frequency (%) |
| 450 | 2796 | 0.1% |
| 500 | 25270 | 0.5% |
| 550 | 280966 | 5.4% |
| 600 | 1194752 | |
| 650 | 466190 | 8.9% |
| 700 | 595436 | |
| 800 | 1137856 | |
| 850 | 776832 | |
| 900 | 416372 | 8.0% |
| 950 | 35814 | 0.7% |
| Value | Count | Frequency (%) |
| 2000 | 9558 | 0.2% |
| 1900 | 15315 | 0.3% |
| 1800 | 6713 | 0.1% |
| 1650 | 33963 | 0.6% |
| 1600 | 5904 | 0.1% |
| 1400 | 4655 | 0.1% |
| 1200 | 45213 | |
| 1100 | 83223 | |
| 1050 | 3661 | 0.1% |
| 1000 | 87941 |
course_type
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 39.9 MiB |
| D | 3229234 |
|---|---|
| T | 988274 |
| I | 752310 |
| O | 193063 |
| M | 65549 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 5228430 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | D |
|---|---|
| 2nd row | D |
| 3rd row | D |
| 4th row | D |
| 5th row | D |
Common Values
| Value | Count | Frequency (%) |
| D | 3229234 | 61.8% |
| T | 988274 | 18.9% |
| I | 752310 | 14.4% |
| O | 193063 | 3.7% |
| M | 65549 | 1.3% |
Words
| Value | Count | Frequency (%) |
| d | 3229234 | 61.8% |
| t | 988274 | 18.9% |
| i | 752310 | 14.4% |
| o | 193063 | 3.7% |
| m | 65549 | 1.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| D | 3229234 | 61.8% |
| T | 988274 | 18.9% |
| I | 752310 | 14.4% |
| O | 193063 | 3.7% |
| M | 65549 | 1.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 5228430 | 100.0% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| D | 3229234 | 61.8% |
| T | 988274 | 18.9% |
| I | 752310 | 14.4% |
| O | 193063 | 3.7% |
| M | 65549 | 1.3% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 5228430 | 100.0% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| D | 3229234 | 61.8% |
| T | 988274 | 18.9% |
| I | 752310 | 14.4% |
| O | 193063 | 3.7% |
| M | 65549 | 1.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 5228430 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| D | 3229234 | 61.8% |
| T | 988274 | 18.9% |
| I | 752310 | 14.4% |
| O | 193063 | 3.7% |
| M | 65549 | 1.3% |
track_condition
Categorical
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 39.9 MiB |
| FT | 2345748 |
|---|---|
| FM | 1391865 |
| GD | 733124 |
| SY | 486795 |
| MY | 172943 |
| Other values (2) | 97955 |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 15685290 |
|---|---|
| Distinct characters | 9 |
| Distinct categories | 2 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | GD |
|---|---|
| 2nd row | GD |
| 3rd row | GD |
| 4th row | GD |
| 5th row | GD |
Common Values
| Value | Count | Frequency (%) |
| FT | 2345748 | 44.9% |
| FM | 1391865 | 26.6% |
| GD | 733124 | 14.0% |
| SY | 486795 | 9.3% |
| MY | 172943 | 3.3% |
| YL | 89250 | 1.7% |
| SF | 8705 | 0.2% |
Words
| Value | Count | Frequency (%) |
| ft | 2345748 | 44.9% |
| fm | 1391865 | 26.6% |
| gd | 733124 | 14.0% |
| sy | 486795 | 9.3% |
| my | 172943 | 3.3% |
| yl | 89250 | 1.7% |
| sf | 8705 | 0.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 5228430 | 33.3% |
| F | 3746318 | 23.9% |
| T | 2345748 | 15.0% |
| M | 1564808 | 10.0% |
| Y | 748988 | 4.8% |
| G | 733124 | 4.7% |
| D | 733124 | 4.7% |
| S | 495500 | 3.2% |
| L | 89250 | 0.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 10456860 | 66.7% |
| Space Separator | 5228430 | 33.3% |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| F | 3746318 | 35.8% |
| T | 2345748 | 22.4% |
| M | 1564808 | 15.0% |
| Y | 748988 | 7.2% |
| G | 733124 | 7.0% |
| D | 733124 | 7.0% |
| S | 495500 | 4.7% |
| L | 89250 | 0.9% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 5228430 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 10456860 | 66.7% |
| Common | 5228430 | 33.3% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| F | 3746318 | 35.8% |
| T | 2345748 | 22.4% |
| M | 1564808 | 15.0% |
| Y | 748988 | 7.2% |
| G | 733124 | 7.0% |
| D | 733124 | 7.0% |
| S | 495500 | 4.7% |
| L | 89250 | 0.9% |
Common
| Value | Count | Frequency (%) |
| (space) | 5228430 | 100.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 15685290 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 5228430 | 33.3% |
| F | 3746318 | 23.9% |
| T | 2345748 | 15.0% |
| M | 1564808 | 10.0% |
| Y | 748988 | 4.8% |
| G | 733124 | 4.7% |
| D | 733124 | 4.7% |
| S | 495500 | 3.2% |
| L | 89250 | 0.6% |
run_up_distance
Real number (ℝ)
| Distinct | 126 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 45.11571504 |
| Minimum | -128 |
|---|---|
| Maximum | 126 |
| Zeros | 59344 |
| Zeros (%) | 1.1% |
| Negative | 454007 |
| Negative (%) | 8.7% |
| Memory size | 5.0 MiB |
Quantile statistics
| Minimum | -128 |
|---|---|
| 5-th percentile | -92 |
| Q1 | 44 |
| median | 52 |
| Q3 | 72 |
| 95-th percentile | 101 |
| Maximum | 126 |
| Range | 254 |
| Interquartile range (IQR) | 28 |
Descriptive statistics
| Standard deviation | 48.31769345 |
|---|---|
| Coefficient of variation (CV) | 1.070972574 |
| Kurtosis | 4.389318633 |
| Mean | 45.11571504 |
| Median Absolute Deviation (MAD) | 12 |
| Skewness | -2.018172041 |
| Sum | 235884358 |
| Variance | 2334.5995 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 54 | 798776 | 15.3% |
| 45 | 396175 | 7.6% |
| 48 | 345963 | 6.6% |
| 52 | 235485 | 4.5% |
| 44 | 197209 | 3.8% |
| 72 | 186176 | 3.6% |
| 32 | 153818 | 2.9% |
| 40 | 141682 | 2.7% |
| 90 | 141586 | 2.7% |
| 50 | 106533 | 2.0% |
| Other values (116) | 2525027 |
| Value | Count | Frequency (%) |
| -128 | 13979 | 0.3% |
| -126 | 20732 | |
| -122 | 4664 | 0.1% |
| -121 | 38563 | |
| -120 | 5148 | 0.1% |
| -118 | 30326 | |
| -116 | 9101 | 0.2% |
| -114 | 24091 | |
| -112 | 20582 | |
| -108 | 7273 | 0.1% |
| Value | Count | Frequency (%) |
| 126 | 59268 | |
| 125 | 3104 | 0.1% |
| 124 | 3432 | 0.1% |
| 123 | 7248 | 0.1% |
| 122 | 2680 | 0.1% |
| 121 | 2944 | 0.1% |
| 120 | 2034 | < 0.1% |
| 118 | 1962 | < 0.1% |
| 117 | 14301 | 0.3% |
| 112 | 26659 |
race_type
Categorical
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 39.9 MiB |
| CLM | 1142801 |
|---|---|
| MSW | 1089307 |
| MCL | 920650 |
| STK | 730102 |
| AOC | 541117 |
| Other values (7) | 804453 |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 15685290 |
|---|---|
| Distinct characters | 12 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | CLM |
|---|---|
| 2nd row | CLM |
| 3rd row | CLM |
| 4th row | CLM |
| 5th row | CLM |
Common Values
| Value | Count | Frequency (%) |
| CLM | 1142801 | |
| MSW | 1089307 | |
| MCL | 920650 | |
| STK | 730102 | |
| AOC | 541117 | |
| ALW | 522748 | |
| STR | 186744 | 3.6% |
| SOC | 36069 | 0.7% |
| SST | 27560 | 0.5% |
| WCL | 22386 | 0.4% |
| Other values (2) | 8946 | 0.2% |
Words
| Value | Count | Frequency (%) |
| clm | 1142801 | |
| msw | 1089307 | |
| mcl | 920650 | |
| stk | 730102 | |
| aoc | 541117 | |
| alw | 522748 | |
| str | 186744 | 3.6% |
| soc | 36069 | 0.7% |
| sst | 27560 | 0.5% |
| wcl | 22386 | 0.4% |
| Other values (2) | 8946 | 0.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| M | 3157018 | |
| C | 2667283 | |
| L | 2608585 | |
| S | 2102028 | |
| W | 1638701 | |
| A | 1063865 | 6.8% |
| T | 944406 | 6.0% |
| K | 730102 | 4.7% |
| O | 577186 | 3.7% |
| R | 186744 | 1.2% |
| Other values (2) | 9372 | 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 15685290 |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| M | 3157018 | |
| C | 2667283 | |
| L | 2608585 | |
| S | 2102028 | |
| W | 1638701 | |
| A | 1063865 | 6.8% |
| T | 944406 | 6.0% |
| K | 730102 | 4.7% |
| O | 577186 | 3.7% |
| R | 186744 | 1.2% |
| Other values (2) | 9372 | 0.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 15685290 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| M | 3157018 | |
| C | 2667283 | |
| L | 2608585 | |
| S | 2102028 | |
| W | 1638701 | |
| A | 1063865 | 6.8% |
| T | 944406 | 6.0% |
| K | 730102 | 4.7% |
| O | 577186 | 3.7% |
| R | 186744 | 1.2% |
| Other values (2) | 9372 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 15685290 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| M | 3157018 | |
| C | 2667283 | |
| L | 2608585 | |
| S | 2102028 | |
| W | 1638701 | |
| A | 1063865 | 6.8% |
| T | 944406 | 6.0% |
| K | 730102 | 4.7% |
| O | 577186 | 3.7% |
| R | 186744 | 1.2% |
| Other values (2) | 9372 | 0.1% |
purse
Real number (ℝ≥0)
| Distinct | 70 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 88878.90778 |
| Minimum | 16000 |
|---|---|
| Maximum | 1500000 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 19.9 MiB |
Quantile statistics
| Minimum | 16000 |
|---|---|
| 5-th percentile | 28000 |
| Q1 | 41000 |
| median | 62000 |
| Q3 | 80000 |
| 95-th percentile | 200000 |
| Maximum | 1500000 |
| Range | 1484000 |
| Interquartile range (IQR) | 39000 |
Descriptive statistics
| Standard deviation | 127824.0255 |
|---|---|
| Coefficient of variation (CV) | 1.438181777 |
| Kurtosis | 42.94515382 |
| Mean | 88878.90778 |
| Median Absolute Deviation (MAD) | 20000 |
| Skewness | 5.974790005 |
| Sum | 4.646971478 × 10^11 |
| Variance | 1.63389815 × 10^10 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 41000 | 353152 | 6.8% |
| 62000 | 281264 | 5.4% |
| 70000 | 279714 | 5.3% |
| 75000 | 250906 | 4.8% |
| 90000 | 242703 | 4.6% |
| 60000 | 234699 | 4.5% |
| 28000 | 228702 | 4.4% |
| 80000 | 208012 | 4.0% |
| 100000 | 195429 | 3.7% |
| 55000 | 181012 | 3.5% |
| Other values (60) | 2772837 |
| Value | Count | Frequency (%) |
| 16000 | 8291 | 0.2% |
| 20000 | 8030 | 0.2% |
| 22000 | 8628 | 0.2% |
| 24000 | 4784 | 0.1% |
| 25000 | 18138 | 0.3% |
| 28000 | 228702 | |
| 30000 | 66496 | 1.3% |
| 31000 | 6463 | 0.1% |
| 32000 | 59805 | 1.1% |
| 33000 | 138581 |
| Value | Count | Frequency (%) |
| 1500000 | 5520 | 0.1% |
| 1250000 | 5568 | 0.1% |
| 1200000 | 3258 | 0.1% |
| 1000000 | 22844 | |
| 850000 | 5040 | 0.1% |
| 750000 | 25145 | |
| 700000 | 19404 | |
| 600000 | 11914 | 0.2% |
| 500000 | 32488 | |
| 400000 | 34385 |
post_time
Real number (ℝ≥0)
| Distinct | 360 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 420.4248074 |
| Minimum | 100 |
|---|---|
| Maximum | 1259 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 10.0 MiB |
Quantile statistics
| Minimum | 100 |
|---|---|
| 5-th percentile | 125 |
| Q1 | 233 |
| median | 349 |
| Q3 | 516 |
| 95-th percentile | 1230 |
| Maximum | 1259 |
| Range | 1159 |
| Interquartile range (IQR) | 283 |
Descriptive statistics
| Standard deviation | 280.2544999 |
|---|---|
| Coefficient of variation (CV) | 0.6665983903 |
| Kurtosis | 2.912988755 |
| Mean | 420.4248074 |
| Median Absolute Deviation (MAD) | 126 |
| Skewness | 1.754678604 |
| Sum | 2198161676 |
| Variance | 78542.58471 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 130 | 94327 | 1.8% |
| 1250 | 94087 | 1.8% |
| 223 | 61610 | 1.2% |
| 551 | 56106 | 1.1% |
| 445 | 53938 | 1.0% |
| 412 | 52381 | 1.0% |
| 202 | 50544 | 1.0% |
| 518 | 49133 | 0.9% |
| 100 | 46225 | 0.9% |
| 308 | 45920 | 0.9% |
| Other values (350) | 4624159 |
| Value | Count | Frequency (%) |
| 100 | 46225 | |
| 101 | 8926 | 0.2% |
| 102 | 37246 | |
| 103 | 19977 | |
| 104 | 6638 | 0.1% |
| 105 | 2198 | < 0.1% |
| 108 | 1710 | < 0.1% |
| 109 | 1698 | < 0.1% |
| 115 | 23589 | |
| 116 | 9350 | 0.2% |
| Value | Count | Frequency (%) |
| 1259 | 17295 | 0.3% |
| 1258 | 7655 | 0.1% |
| 1257 | 1848 | < 0.1% |
| 1256 | 11315 | 0.2% |
| 1255 | 20626 | 0.4% |
| 1254 | 4554 | 0.1% |
| 1252 | 4956 | 0.1% |
| 1251 | 17554 | 0.3% |
| 1250 | 94087 | |
| 1249 | 8985 | 0.2% |
weight_carried
Real number (ℝ≥0)
| Distinct | 35 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 120.7132914 |
| Minimum | 110 |
|---|---|
| Maximum | 160 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 10.0 MiB |
Quantile statistics
| Minimum | 110 |
|---|---|
| 5-th percentile | 115 |
| Q1 | 118 |
| median | 120 |
| Q3 | 122 |
| 95-th percentile | 125 |
| Maximum | 160 |
| Range | 50 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 4.351880945 |
|---|---|
| Coefficient of variation (CV) | 0.03605138172 |
| Kurtosis | 25.79531531 |
| Mean | 120.7132914 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 3.737747874 |
| Sum | 631140994 |
| Variance | 18.93886776 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 118 | 805184 | |
| 120 | 768432 | |
| 119 | 747592 | |
| 122 | 628713 | |
| 124 | 510510 | |
| 121 | 457490 | |
| 123 | 384078 | |
| 125 | 204079 | 3.9% |
| 126 | 133556 | 2.6% |
| 117 | 126486 | 2.4% |
| Other values (25) | 462310 |
| Value | Count | Frequency (%) |
| 110 | 328 | < 0.1% |
| 111 | 25915 | 0.5% |
| 112 | 18884 | 0.4% |
| 113 | 61619 | 1.2% |
| 114 | 82722 | 1.6% |
| 115 | 88757 | 1.7% |
| 116 | 115239 | 2.2% |
| 117 | 126486 | 2.4% |
| 118 | 805184 | |
| 119 | 747592 |
| Value | Count | Frequency (%) |
| 160 | 1062 | < 0.1% |
| 158 | 1062 | < 0.1% |
| 156 | 9624 | |
| 155 | 643 | < 0.1% |
| 154 | 1864 | < 0.1% |
| 153 | 8026 | |
| 152 | 6472 | |
| 150 | 4539 | |
| 149 | 1918 | < 0.1% |
| 148 | 3593 | 0.1% |
jockey
Categorical
| Distinct | 178 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 39.9 MiB |
| Manuel Franco | 444024 |
|---|---|
| Dylan Davis | 381018 |
| Jose Lezcano | 344802 |
| Junior Alvarado | 299324 |
| Irad Ortiz Jr. | 282044 |
| Other values (173) | 3477218 |
Length
| Max length | 26 |
|---|---|
| Median length | 22 |
| Mean length | 14.10091901 |
| Min length | 8 |
Characters and Unicode
| Total characters | 73725668 |
|---|---|
| Distinct characters | 51 |
| Distinct categories | 5 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Andre Shivnarine Worrie |
|---|---|
| 2nd row | Andre Shivnarine Worrie |
| 3rd row | Andre Shivnarine Worrie |
| 4th row | Andre Shivnarine Worrie |
| 5th row | Andre Shivnarine Worrie |
Common Values
| Value | Count | Frequency (%) |
| Manuel Franco | 444024 | 8.5% |
| Dylan Davis | 381018 | 7.3% |
| Jose Lezcano | 344802 | 6.6% |
| Junior Alvarado | 299324 | 5.7% |
| Irad Ortiz Jr. | 282044 | 5.4% |
| Eric Cancel | 248857 | 4.8% |
| Jose L. Ortiz | 247944 | 4.7% |
| Kendrick Carmouche | 246683 | 4.7% |
| Joel Rosario | 232221 | 4.4% |
| Luis Saez | 231061 | 4.4% |
| Other values (168) | 2270452 |
Words
| Value | Count | Frequency (%) |
| jose | 598981 | 4.9% |
| jr | 546373 | 4.5% |
| ortiz | 529988 | 4.4% |
| manuel | 454841 | 3.7% |
| franco | 444024 | 3.7% |
| r | 424404 | 3.5% |
| luis | 400512 | 3.3% |
| davis | 383597 | 3.2% |
| dylan | 381018 | 3.1% |
| lezcano | 344802 | 2.8% |
| Other values (308) | 7627810 |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 7693318 | 10.4% |
| (space) | 6907920 | 9.4% |
| e | 6426641 | 8.7% |
| r | 5788644 | 7.9% |
| n | 4584054 | 6.2% |
| o | 4449303 | 6.0% |
| i | 4065752 | 5.5% |
| l | 3143444 | 4.3% |
| J | 2442535 | 3.3% |
| z | 2305874 | 3.1% |
| Other values (41) | 25918183 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 53142084 | |
| Uppercase Letter | 12166427 | 16.5% |
| Space Separator | 6907920 | 9.4% |
| Other Punctuation | 1508408 | 2.0% |
| Dash Punctuation | 829 | < 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 7693318 | |
| e | 6426641 | |
| r | 5788644 | |
| n | 4584054 | |
| o | 4449303 | |
| i | 4065752 | 7.7% |
| l | 3143444 | 5.9% |
| z | 2305874 | 4.3% |
| s | 2232138 | 4.2% |
| c | 2223535 | 4.2% |
| Other values (16) | 10229381 |
Uppercase Letter
| Value | Count | Frequency (%) |
|---|---|---|
| J | 2442535 | 20.1% |
| R | 1364989 | 11.2% |
| L | 1123479 | 9.2% |
| C | 1005867 | 8.3% |
| D | 963403 | 7.9% |
| M | 874165 | 7.2% |
| H | 560294 | 4.6% |
| O | 543815 | 4.5% |
| S | 511289 | 4.2% |
| F | 475400 | 3.9% |
| Other values (12) | 2301191 | 18.9% |
Space Separator
| Value | Count | Frequency (%) |
|---|---|---|
| (space) | 6907920 | 100.0% |
Other Punctuation
| Value | Count | Frequency (%) |
|---|---|---|
| . | 1508408 | 100.0% |
Dash Punctuation
| Value | Count | Frequency (%) |
|---|---|---|
| - | 829 | 100.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Latin | 65308511 | 88.6% |
| Common | 8417157 | 11.4% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
|---|---|---|
| a | 7693318 | 11.8% |
| e | 6426641 | 9.8% |
| r | 5788644 | 8.9% |
| n | 4584054 | 7.0% |
| o | 4449303 | 6.8% |
| i | 4065752 | 6.2% |
| l | 3143444 | 4.8% |
| J | 2442535 | 3.7% |
| z | 2305874 | 3.5% |
| s | 2232138 | 3.4% |
| Other values (38) | 22176808 | 34.0% |
Common
| Value | Count | Frequency (%) |
|---|---|---|
| (space) | 6907920 | 82.1% |
| . | 1508408 | 17.9% |
| - | 829 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 73725668 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
|---|---|---|
| a | 7693318 | 10.4% |
| (space) | 6907920 | 9.4% |
| e | 6426641 | 8.7% |
| r | 5788644 | 7.9% |
| n | 4584054 | 6.2% |
| o | 4449303 | 6.0% |
| i | 4065752 | 5.5% |
| l | 3143444 | 4.3% |
| J | 2442535 | 3.3% |
| z | 2305874 | 3.1% |
| Other values (41) | 25918183 | 35.2% |
odds
Real number (ℝ≥0)
| Distinct | 656 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1482.919877 |
| Minimum | 0 |
|---|---|
| Maximum | 19100 |
| Zeros | 978 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 10.0 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 115 |
| Q1 | 335 |
| median | 730 |
| Q3 | 1740 |
| 95-th percentile | 5675 |
| Maximum | 19100 |
| Range | 19100 |
| Interquartile range (IQR) | 1405 |
Descriptive statistics
| Standard deviation | 1952.415229 |
|---|---|
| Coefficient of variation (CV) | 1.316601969 |
| Kurtosis | 9.102398076 |
| Mean | 1482.919877 |
| Median Absolute Deviation (MAD) | 490 |
| Skewness | 2.701129674 |
| Sum | 7753342770 |
| Variance | 3811925.225 |
| Monotonicity | Not monotonic |
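Several of the figures above are derived from others in the same tables. The following is a minimal sketch, using only the values printed in this report, that recomputes the coefficient of variation, variance, interquartile range, range and sum. The variable names are illustrative and are not taken from pandas-profiling.

```python
import math

# Values copied from the odds statistics tables in this report.
mean = 1482.919877
std = 1952.415229
n_observations = 5228430
q1, q3 = 335, 1740
minimum, maximum = 0, 19100

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
total = mean * n_observations    # sum of all values

# Compare against the reported figures, allowing for rounding in the report.
print(math.isclose(cv, 1.316601969, rel_tol=1e-6))
print(math.isclose(variance, 3811925.225, rel_tol=1e-6))
print(iqr, value_range)
print(math.isclose(total, 7753342770, rel_tol=1e-6))
```

The comparisons use a relative tolerance because the report prints the mean and standard deviation to a limited number of digits.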
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 420 | 45315 | 0.9% |
| 430 | 43158 | 0.8% |
| 460 | 42417 | 0.8% |
| 410 | 40648 | 0.8% |
| 440 | 38635 | 0.7% |
| 400 | 38563 | 0.7% |
| 550 | 37430 | 0.7% |
| 520 | 37132 | 0.7% |
| 510 | 36374 | 0.7% |
| 490 | 35402 | 0.7% |
| Other values (646) | 4833356 | 92.4% |
Minimum values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 978 | < 0.1% |
| 5 | 293 | < 0.1% |
| 15 | 2497 | < 0.1% |
| 20 | 3292 | 0.1% |
| 25 | 3805 | 0.1% |
| 30 | 5142 | 0.1% |
| 35 | 6939 | 0.1% |
| 40 | 7563 | 0.1% |
| 45 | 9170 | 0.2% |
| 50 | 7556 | 0.1% |
Maximum values
| Value | Count | Frequency (%) |
|---|---|---|
| 19100 | 409 | < 0.1% |
| 17425 | 314 | < 0.1% |
| 16900 | 409 | < 0.1% |
| 16800 | 321 | < 0.1% |
| 15275 | 1074 | < 0.1% |
| 15000 | 404 | < 0.1% |
| 14850 | 275 | < 0.1% |
| 14550 | 289 | < 0.1% |
| 14175 | 304 | < 0.1% |
| 14125 | 310 | < 0.1% |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
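The calculation described above can be sketched in plain Python. This is an illustrative sketch, not the code pandas-profiling runs: the function names `average_ranks`, `pearson` and `spearman_rho` are hypothetical, and ties are assumed to receive the average of the ranks they span.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance divided by the product of the standard deviations.

    The common 1/n factors cancel, so plain sums are used.
    Raises ZeroDivisionError if either input is constant.
    """
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


# A nonlinear but strictly increasing relationship has identical ranks on both sides.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(spearman_rho(x, y))
```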
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
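As a minimal sketch of that formula, the following plain-Python function computes r and checks the invariance under changes of location and scale. The name `pearson_r` and the sample data are made up for illustration, and the scale factors are assumed to be positive, since a negative factor flips the sign of r.

```python
import math
from statistics import mean


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations.

    The 1/n factors cancel between numerator and denominator, so sums suffice.
    Raises ZeroDivisionError if either input is constant.
    """
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# Shifting and rescaling each variable separately leaves r unchanged.
x_shifted = [2 * a + 3 for a in x]
y_shifted = [0.5 * b - 1 for b in y]
print(math.isclose(pearson_r(x, y), pearson_r(x_shifted, y_shifted)))

# A monotonic but nonlinear relationship gives r below 1.
print(pearson_r(x, [a ** 3 for a in x]) < 1)
```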
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
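The pair-counting definition above corresponds to the variant usually called tau-a. The sketch below implements that definition directly; the function name `kendall_tau_a` is hypothetical, and library implementations often use a tie-corrected variant instead, so results can differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, as described in the text.

    Tied pairs count as neither concordant nor discordant.
    This is O(n^2), so it is only practical for small inputs.
    """
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    n = len(x)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


# Six pairs in total: five concordant, one discordant (the swapped middle values).
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))
```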
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available.
First rows
| | track_id | race_date | race_number | program_number | trakus_index | latitude | longitude | distance_id | course_type | track_condition | run_up_distance | race_type | purse | post_time | weight_carried | jockey | odds |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | AQU | 2019-01-01 | 9 | 6 | 72 | 40.672902 | -73.827607 | 600 | D | GD | 48 | CLM | 25000 | 420 | 120 | Andre Shivnarine Worrie | 2090 |
| 1 | AQU | 2019-01-01 | 9 | 6 | 73 | 40.672946 | -73.827587 | 600 | D | GD | 48 | CLM | 25000 | 420 | 120 | Andre Shivnarine Worrie | 2090 |
| 2 | AQU | 2019-01-01 | 9 | 6 | 74 | 40.672990 | -73.827568 | 600 | D | GD | 48 | CLM | 25000 | 420 | 120 | Andre Shivnarine Worrie | 2090 |
| 3 | AQU | 2019-01-01 | 9 | 6 | 63 | 40.672510 | -73.827781 | 600 | D | GD | 48 | CLM | 25000 | 420 | 120 | Andre Shivnarine Worrie | 2090 |
| 4 | AQU | 2019-01-01 | 9 | 6 | 64 | 40.672553 | -73.827762 | 600 | D | GD | 48 | CLM | 25000 | 420 | 120 | Andre Shivnarine Worrie | 2090 |
| 5 | AQU | 2019-01-01 | 9 | 6 | 65 | 40.672596 | -73.827742 | 600 | D | GD | 48 | CLM | 25000 | 420 | 120 | Andre Shivnarine Worrie | 2090 |
| 6 | AQU | 2019-01-01 | 9 | 6 | 66 | 40.672640 | -73.827723 | 600 | D | GD | 48 | CLM | 25000 | 420 | 120 | Andre Shivnarine Worrie | 2090 |
| 7 | AQU | 2019-01-01 | 9 | 6 | 67 | 40.672683 | -73.827703 | 600 | D | GD | 48 | CLM | 25000 | 420 | 120 | Andre Shivnarine Worrie | 2090 |
| 8 | AQU | 2019-01-01 | 9 | 6 | 68 | 40.672726 | -73.827684 | 600 | D | GD | 48 | CLM | 25000 | 420 | 120 | Andre Shivnarine Worrie | 2090 |
| 9 | AQU | 2019-01-01 | 9 | 6 | 57 | 40.672243 | -73.827903 | 600 | D | GD | 48 | CLM | 25000 | 420 | 120 | Andre Shivnarine Worrie | 2090 |
Last rows
| | track_id | race_date | race_number | program_number | trakus_index | latitude | longitude | distance_id | course_type | track_condition | run_up_distance | race_type | purse | post_time | weight_carried | jockey | odds |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5228420 | AQU | 2019-11-23 | 9 | 2 | 174 | 40.672080 | -73.830989 | 1100 | T | GD | 72 | STK | 200000 | 353 | 124 | Joel Rosario | 1120 |
| 5228421 | AQU | 2019-11-23 | 9 | 2 | 175 | 40.672038 | -73.831008 | 1100 | T | GD | 72 | STK | 200000 | 353 | 124 | Joel Rosario | 1120 |
| 5228422 | AQU | 2019-11-23 | 9 | 2 | 176 | 40.671996 | -73.831027 | 1100 | T | GD | 72 | STK | 200000 | 353 | 124 | Joel Rosario | 1120 |
| 5228423 | AQU | 2019-11-23 | 9 | 2 | 177 | 40.671955 | -73.831045 | 1100 | T | GD | 72 | STK | 200000 | 353 | 124 | Joel Rosario | 1120 |
| 5228424 | AQU | 2019-11-23 | 9 | 2 | 166 | 40.672403 | -73.830834 | 1100 | T | GD | 72 | STK | 200000 | 353 | 124 | Joel Rosario | 1120 |
| 5228425 | AQU | 2019-11-23 | 9 | 2 | 167 | 40.672363 | -73.830853 | 1100 | T | GD | 72 | STK | 200000 | 353 | 124 | Joel Rosario | 1120 |
| 5228426 | AQU | 2019-11-23 | 9 | 2 | 168 | 40.672321 | -73.830873 | 1100 | T | GD | 72 | STK | 200000 | 353 | 124 | Joel Rosario | 1120 |
| 5228427 | AQU | 2019-11-23 | 9 | 2 | 169 | 40.672281 | -73.830893 | 1100 | T | GD | 72 | STK | 200000 | 353 | 124 | Joel Rosario | 1120 |
| 5228428 | AQU | 2019-11-23 | 9 | 2 | 170 | 40.672240 | -73.830913 | 1100 | T | GD | 72 | STK | 200000 | 353 | 124 | Joel Rosario | 1120 |
| 5228429 | AQU | 2019-11-23 | 9 | 2 | 171 | 40.672200 | -73.830932 | 1100 | T | GD | 72 | STK | 200000 | 353 | 124 | Joel Rosario | 1120 |